On the use of social signal for reward shaping in reinforcement learning for dialogue management
Authors
Abstract
This paper investigates the conditions under which social signals (facial expressions, postures, gazes, etc.), especially non-verbal multimodal user appraisal, can help accelerate the learning of a Reinforcement Learning (RL) agent in the dialogue management context. For this purpose, a potential-based reward shaping method is used jointly with the Kalman Temporal Differences (KTD) framework so as to integrate the social aspects into an efficient optimization procedure through additional social-based reinforcement signals. Besides its general interest, this procedure could benefit system development by allowing the designer to teach the system through explicit signals at an early stage of training. Experiments carried out in a simulation setup using the state-of-the-art goal-oriented Hidden Information State (HIS) dialogue management framework confirm the interest of the proposed approach.
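As a rough illustration of the potential-based shaping idea mentioned in the abstract, the sketch below adds the standard shaping term gamma * phi(s') - phi(s) to an environment reward. It is not the paper's KTD-based implementation. The function names, the example rewards, and the potential values (imagined here as a user-appraisal score in [0, 1] derived from social signals) are all assumptions made for illustration.

```python
from typing import Sequence


def shaped_reward(reward: float, phi_s: float, phi_next: float, gamma: float) -> float:
    """Environment reward plus the potential-based term gamma * phi(s') - phi(s)."""
    return reward + gamma * phi_next - phi_s


def plain_return(rewards: Sequence[float], gamma: float) -> float:
    """Discounted return of the unshaped rewards along one trajectory."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))


def shaped_return(rewards: Sequence[float], potentials: Sequence[float], gamma: float) -> float:
    """Discounted return of the shaped rewards.

    `potentials` holds one value per visited state, so it has
    len(rewards) + 1 entries (the last one is the final state).
    """
    return sum(
        gamma ** t * shaped_reward(r, potentials[t], potentials[t + 1], gamma)
        for t, r in enumerate(rewards)
    )


# Hypothetical three-turn dialogue: a per-turn penalty, then a success reward.
rewards = [-1.0, -1.0, 20.0]
# Hypothetical potentials, e.g. an appraisal score read from social signals.
potentials = [0.2, 0.5, 0.9, 0.0]
gamma = 0.95

difference = shaped_return(rewards, potentials, gamma) - plain_return(rewards, gamma)
# The shaping terms telescope, so the difference depends only on the
# first and last potentials, not on the actions taken in between.
expected = gamma ** len(rewards) * potentials[-1] - potentials[0]
```

Because the shaping terms cancel along a trajectory, the shaped and unshaped returns differ only by a quantity fixed by the start and end states. This is why potential-based shaping can change how fast an agent learns without changing which policy is optimal.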
Similar resources
Reward Shaping for Statistical Optimisation of Dialogue Management
This paper investigates the impact of reward shaping on a reinforcement learning-based spoken dialogue system’s learning. A diffuse reward function gives a reward after each transition between two dialogue states. A sparse function only gives a reward at the end of the dialogue. Reward shaping consists of learning a diffuse function without modifying the optimal policy compared to a sparse one....
Multi-Objectivization in Reinforcement Learning
Multi-objectivization is the process of transforming a single objective problem into a multi-objective problem. Research in evolutionary optimization has demonstrated that the addition of objectives that are correlated with the original objective can make the resulting problem easier to solve compared to the original single-objective problem. In this paper we investigate the multi-objectivizati...
Reward Shaping with Recurrent Neural Networks for Speeding up On-Line Policy Learning in Spoken Dialogue Systems
Statistical spoken dialogue systems have the attractive property that they can be optimised from data via interactions with real users. However, in the reinforcement learning paradigm the dialogue manager (agent) often requires significant time to explore the state-action space to learn to behave in a desirable manner. This is a critical issue when the system is trained on-line with real user...
An Adaptive Learning Game for Autistic Children using Reinforcement Learning and Fuzzy Logic
This paper presents an adapted serious game for rating social ability in children with autism spectrum disorder (ASD). The required measurements are obtained through the challenges of the proposed game. The game uses reinforcement learning concepts to be adaptive and relies on fuzzy logic to evaluate the social ability level of children with ASD. The game adapts itsel...
Reward-Balancing for Statistical Spoken Dialogue Systems using Multi-objective Reinforcement Learning
Reinforcement learning is widely used for dialogue policy optimization where the reward function often consists of more than one component, e.g., the dialogue success and the dialogue length. In this work, we propose a structured method for finding a good balance between these components by searching for the optimal reward component weighting. To render this search feasible, we use multi-object...
Publication date: 2013